EXTREME: an online EM algorithm for motif discovery
Similar resources
EXTREME: an online EM algorithm for motif discovery
MOTIVATION: Identifying regulatory elements is a fundamental problem in the field of gene transcription. Motif discovery, the task of identifying the sequence preference of transcription factor proteins, which bind to these elements, is an important step in this challenge. MEME is a popular motif discovery algorithm. Unfortunately, MEME's running time scales poorly with the size of the dataset. Ex...
An Efficient Algorithm for String Motif Discovery
Finding common patterns, motifs, in a set of DNA sequences is an important problem in bioinformatics. One common representation of motifs is a string with symbols A, C, G, T and N where N stands for the wildcard symbol. In this paper, we introduce a more general motif discovery problem without any weaknesses of the Planted (l,d)-Motif Problem and also a set of control sequences as an additional...
An Online Hierarchical Algorithm for Extreme Clustering
Many modern clustering methods scale well to a large number of data items, N, but not to a large number of clusters, K. This paper introduces PERCH, a new non-greedy algorithm for online hierarchical clustering that scales to both massive N and K, a problem setting we term extreme clustering. Our algorithm efficiently routes new data points to the leaves of an incrementally-built tree. Motivate...
An Online EM Algorithm Using Component Reduction
The EM algorithm has been widely used in many learning or statistical tasks. However, since it requires multiple database scans, applying the EM algorithm to data streams is not straightforward. In this paper we propose an online EM algorithm which can deal with data streams. The algorithm utilizes a component reduction technique which reduces the number of components in a mixture model. A not...
Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
Journal
Journal title: Bioinformatics
Year: 2014
ISSN: 1460-2059,1367-4803
DOI: 10.1093/bioinformatics/btu093